
Implementing SSH Key-Based Authentication for Secure Ansible Managed Node Connectivity

The first time an Ansible playbook froze on me, I was dismayed to finally track the root cause down to a password prompt; an automation tool is of little use if it keeps demanding that I hand out passwords. It was late, and thanks to an elaborate chain of unfortunate events I was also seeing “Host key verification failed”. That was the moment I decided to build a genuinely reliable foundation for connectivity. I rebuilt the entire connection process from scratch around key-based authentication, and I have not had a problem with it since. This guide will help you set up your own Ansible SSH key authentication configuration so you can stop fighting SSH and start trusting the link between your control node and your managed hosts.

Quick Summary

  • First, we will generate a new Ed25519 key pair and add the public half to the authorized_keys file of every managed host.
  • Next, we will pin the private key path in ansible.cfg so Ansible never has to guess which key to use.
  • We will use ssh-agent to cache the passphrase-protected key so the automation user can authenticate without retyping the passphrase.
  • We will configure sudo to run commands without a password.
  • We will deal with stale known_hosts entries and the perennially troublesome “Permission denied (publickey)” message.

Prerequisites for the Control Node and Managed Hosts

Verifying the ansible_user Permissions

To begin, a dedicated automation user needs to exist on each managed host; throughout this guide it is called automator. Before you can add your public SSH key, verify that you can SSH from the control node into that account with a password. If the user does not already exist, create it with useradd -m automator and assign it a temporary password.
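
On each managed node, that looks like the following (run as root or via sudo; the temporary password is only needed until the key is in place):

$ sudo useradd -m automator
$ sudo passwd automator
New password: 
Retype new password: 
passwd: password updated successfully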

For each host you want to manage with Ansible, make sure the .ssh directory has the correct permissions (0700) and that the home directory itself is not group- or world-writable. A cautionary word of advice: if the /home/automator directory is world-writable (writable by any user), sshd will silently refuse to accept your key. After you add your key, run chmod 700 ~/.ssh and chmod 600 ~/.ssh/authorized_keys to correct the permissions. In my experience, bad permissions are by far the most common cause of the “key ignored” problems Ansible users run into.
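
Run as the automator user on each managed node, the full sequence looks like this (755 on the home directory is one safe choice; anything without group or world write access works):

$ chmod 755 /home/automator
$ chmod 700 ~/.ssh
$ chmod 600 ~/.ssh/authorized_keys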

Network Port and Firewall Readiness

Ansible communicates with managed nodes over TCP port 22 by default. If you have a host-based firewall (UFW, firewalld, or a cloud security group), make sure port 22 is open from the control node's IP to the managed node. You can quickly test whether the managed node is reachable with nc -zv managed-node.example.com 22. If a host on a corporate VPN or in a home lab listens on a port other than 22, specify that port with the ansible_port variable in your inventory.
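
For example, an inventory entry for a node listening on port 2222 (the port number here is purely illustrative) would look like this:

[managed]
managed-node.example.com ansible_port=2222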


The Ultimate Ansible SSH Key Authentication Setup Guide

Generating the Ed25519 or RSA Key Pair

I always generate Ed25519 key pairs: they are faster than RSA and generally considered comparable in strength to RSA 4096. To generate an SSH key, run this command on your control node (the system that has Ansible installed):

$ ssh-keygen -t ed25519 -C "ansible-control" -f ~/.ssh/ansible_ed25519
Generating public/private ed25519 key pair.
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/you/.ssh/ansible_ed25519
Your public key has been saved in /home/you/.ssh/ansible_ed25519.pub
The key fingerprint is:
SHA256:qzD7LqKqR3Fk/9E1lHkGJmDZ0cJ0Vt2O4y9n5NhD1jk ansible-control
The key's randomart image is:
+--[ED25519 256]--+
|   .o. . .       |
|  . .o + . .     |
|   . o + . .     |
|    o +   .      |
|   . = +S  .     |
|    o.O +.. .    |
|     B o ..o     |
|    . E  . .o    |
|     .o.   ..    |
+----[SHA256]-----+

This command generates a pair of files: ansible_ed25519 (the private key) and ansible_ed25519.pub (the public key). The -C option tags the public key with a comment so you can tell what the key is for later. I always protect the private half with a passphrase; I will hand that passphrase to ssh-agent shortly.

Appending Public Keys to authorized_keys

Next, your public key must land in the authorized_keys file of every managed node. The simplest method is ssh-copy-id, which handles permissions correctly and skips keys that are already installed.

$ ssh-copy-id -i ~/.ssh/ansible_ed25519.pub automator@managed-node.example.com
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/you/.ssh/ansible_ed25519.pub"
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
automator@managed-node.example.com's password: 

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'automator@managed-node.example.com'"
and check to make sure that only the key(s) you wanted were added.

Once you’ve copied your key to all the nodes, verify that you can log in with no password (or with just the key’s passphrase). If you have many nodes, consider a short bash loop to iterate through them, or feed the temporary password in with a bootstrapping playbook, as described in the ssh-copy-id documentation’s notes on scripting.
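
A minimal loop sketch, assuming a plain-text file named hosts.txt with one hostname per line (the filename is arbitrary):

$ while read -r host; do
      ssh-copy-id -i ~/.ssh/ansible_ed25519.pub "automator@${host}"
  done < hosts.txt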

Defining the Private Key Path in ansible.cfg

When Ansible connects to remote hosts, it needs to know which private key to use for each connection. Create an ansible.cfg file in your project directory to specify the private key path and the remote user:

[defaults]
inventory = ./hosts
host_key_checking = False
private_key_file = ~/.ssh/ansible_ed25519
remote_user = automator

[ssh_connection]
ssh_args = -o ControlMaster=auto -o ControlPersist=60s
pipelining = True

The remote_user setting corresponds to the ansible_user you created on those servers, while private_key_file locks Ansible to that specific key instead of letting it fall back to the default ~/.ssh/id_rsa. Setting host_key_checking = False smooths first-time connections to new hosts; more on that later. You can also set ansible_ssh_private_key_file per host in your inventory, but for a clean, consistent setup across all hosts, it pays to have this global fallback.
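
With the config in place, a quick smoke test from the project directory should come back green (output trimmed; newer Ansible versions also print discovered interpreter facts):

$ ansible all -m ping
managed-node.example.com | SUCCESS => {
    "changed": false,
    "ping": "pong"
}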

Why I Ultimately Chose This Route

Before settling on this approach, I tried several different configurations. First I used a key with no passphrase. It was very simple to create, but if it ever fell into the wrong hands, whoever held it would have root on every one of my managed nodes. Next I experimented with per-host ansible_ssh_private_key_file settings isolated in different inventory groups, but once we added staging and ephemeral testing nodes, it quickly became an entangled mess. Pulling the private key path into one central ansible.cfg and supplying the passphrase via ssh-agent turned out to be the best balance of security and low friction for automation. The extra ten minutes of setup have since saved me hours of debugging.


Security Hardening and Privilege Escalation

Configuring Passwordless Sudo for Ansible Execution

The automator user will generally need to execute commands as root. Rather than embedding a sudo password somewhere, grant passwordless sudo with a dedicated drop-in file.

Run sudo visudo -f /etc/sudoers.d/automator on every managed node and add this line:

automator ALL=(ALL) NOPASSWD: ALL

This is the nuclear option: it allows the automation user to run any command without providing a password. If you would like to tighten things up, you can limit it to specific commands (e.g. /usr/bin/apt). For most infrastructure automation jobs, though, broad access is the accepted trade-off. On the Ansible side, make sure you’re using:

become: yes
become_method: sudo

in your playbooks; those settings complement this sudoers entry perfectly.
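
If you do opt for the tighter variant mentioned above, a scoped sudoers entry might look like the line below. The command list is purely illustrative; note that Ansible’s become wraps module execution in /bin/sh, so command-scoped sudo rules frequently break Ansible tasks and need careful testing.

automator ALL=(ALL) NOPASSWD: /usr/bin/apt, /usr/bin/systemctl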

Leveraging ssh-agent for Encrypted Private Keys

A passphrase-protected private key will not work for automation unless it is unlocked once and the credential stays cached. That is exactly what ssh-agent does: it holds your decrypted keys in memory while you are connecting to your servers.

To use ssh-agent, start the agent and then add your key with ssh-add:

$ eval $(ssh-agent -s)
Agent pid 14215
$ ssh-add ~/.ssh/ansible_ed25519
Enter passphrase for /home/you/.ssh/ansible_ed25519: 
Identity added: /home/you/.ssh/ansible_ed25519 (ansible-control)

Once ssh-agent is running with your key loaded, every Ansible command issued from that terminal authenticates with the unlocked key. Note also that, as described in the ssh-agent manual, you can forward the agent connection over SSH (-A), so if you have to hop through a jump server or bastion host to reach your managed nodes, you can still authenticate with the same key. For a control node with no intermediate hosts, the setup above is sufficient. When you close the terminal, the agent dies with it, and the key is locked again.
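
If you do route through a bastion, one way to wire that up is through the [ssh_connection] section of ansible.cfg; the bastion hostname below is hypothetical:

[ssh_connection]
ssh_args = -o ControlMaster=auto -o ControlPersist=60s -o ForwardAgent=yes -o ProxyJump=bastion.example.com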


Debugging Stale known_hosts Entries and StrictHostKeyChecking Failures

Fixing “Permission denied (publickey)” Errors

When an Ansible command fails with this error, it is usually for one of three reasons: the private key does not match the public key in the managed node’s authorized_keys file, the key’s permissions are loose enough that SSH refuses to use it, or SSH cannot locate the key at all. Use a verbose ping to see what actually happened:

$ ansible all -m ping -vvv
...
 ESTABLISH SSH CONNECTION FOR USER: automator
 SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o 'IdentityFile="/home/you/.ssh/ansible_ed25519"' -o KbdInteractiveAuthentication=no -o PreferredAuthentications=publickey -o PasswordAuthentication=no -o ConnectTimeout=10 managed-node.example.com '/bin/sh -c '"'"'echo ~automator && sleep 0'"'"'' 
 (255, b'', b'automator@managed-node.example.com: Permission denied (publickey).\r\n')  <-- authentication rejected

Notice where in the output the rejection occurs. From here, test whether you can log into the managed node directly with the same key:
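
$ ssh -i ~/.ssh/ansible_ed25519 automator@managed-node.example.com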

If you can log in this way, Ansible is probably not picking up your ansible.cfg file (check your $ANSIBLE_CONFIG environment variable, or run Ansible from the directory where ansible.cfg resides). If the direct SSH login also fails, there are two likely causes: the key in the node’s authorized_keys does not match your private key, or the node’s SSH daemon has PubkeyAuthentication set to no in /etc/ssh/sshd_config.

Bypassing Host Key Verification for Ephemeral Nodes

When you create nodes for cloud testing or a continuous integration (CI) environment and destroy them shortly afterward, the known_hosts file fills up with obsolete host key entries. To disable host key checking entirely, Ansible gives you two options: set an environment variable, or use the host_key_checking line in the ansible.cfg we wrote earlier.

$ export ANSIBLE_HOST_KEY_CHECKING=False
$ ansible-playbook -i ephemeral_hosts deploy.yml

This is a reasonable approach for short-lived test nodes, but do not leave host key checking disabled in production; properly distributed host keys, for instance signed by an SSH certificate authority, are the better long-term solution.
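
If only a single entry has gone stale, you do not have to disable checking at all; just remove the old key for that host:

$ ssh-keygen -R managed-node.example.com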


Frequently Asked Questions

How do I use different SSH keys for different Ansible inventory groups?

Set the ansible_ssh_private_key_file variable at the group level (or per host) in your inventory file.

[web_servers]
web01
web02

[web_servers:vars]
ansible_ssh_private_key_file=~/.ssh/web_key

[database_servers]
db01

[database_servers:vars]
ansible_ssh_private_key_file=~/.ssh/db_key

Host and group variables take precedence over the global private_key_file setting in ansible.cfg, so you can mix keys without touching the central configuration.

Why is Ansible ignoring my specific private key path?

Usually the cause is the --private-key command-line option or an environment variable overriding your configuration: your ansible.cfg may be correct, but Ansible always prefers the CLI over any other setting. Run ansible-config dump | grep private_key to see which key Ansible will actually use at execution time. More often than not, the culprit is an old export statement lurking in your .bashrc.
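
The dump shows the winning value and, in parentheses, where it came from; expect output along these lines (paths will match your own setup):

$ ansible-config dump | grep -i private_key
DEFAULT_PRIVATE_KEY_FILE(/home/you/project/ansible.cfg) = /home/you/.ssh/ansible_ed25519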

Can I push SSH keys to managed nodes using an Ansible playbook itself?

Yes, by using the authorized_key module (shipped in the ansible.posix collection on newer Ansible releases).

- name: Push public key to automator user
  authorized_key:
    user: automator
    state: present
    key: "{{ lookup('file', '~/.ssh/ansible_ed25519.pub') }}"

You would run the playbook once with a temporary password via the -k option (prompt for the SSH password), either as the automator user itself or as an initial bootstrap user that still has password authentication enabled; every run after that can use the key. This is the cleanest method for bootstrapping a new fleet of nodes.
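
The one-time bootstrap run might look like this, where bootstrap.yml is a hypothetical playbook containing the task above:

$ ansible-playbook -i hosts bootstrap.yml -u automator -k
SSH password: 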
